6 research outputs found

    Visual decisions in the analysis of customers online shopping behavior

    Get PDF
    The analysis of the online customer shopping behavior is an important task nowadays, which allows maximizing the efficiency of advertising campaigns and increasing the return of investment for advertisers. The analysis results of online customer shopping behavior are usually reviewed and understood by a non-technical person; therefore the results must be displayed in the easiest possible way. The online shopping data is multidimensional and consists of both numerical and categorical data. In this paper, an approach has been proposed for the visual analysis of the online shopping data and their relevance. It integrates several multidimensional data visualization methods of different nature. The results of the visual analysis of numerical data are combined with the categorical data values. Based on the visualization results, the decisions on the advertising campaign could be taken in order to increase the return of investment and attract more customers to buy in the online e-shop

    Tikimybinis dažnų posekių paieškos algoritmas

    Get PDF
    Dažnų posekių paieška didelėse duomenų bazėse yra svarbi biologinių, klimato, fi nansinių ir daugelio kitų duomenų bazių analizei. Tikslieji algoritmai, skirti dažnų posekių paieškai, daug kartų perrenka visą duomenų bazę. Jeigu duomenų bazė didelė, tai dažnų posekių paieška yra lėta arba reikalingi superkompiuteriai. Straipsnyje pasiūlytas naujas tikimybinis dažnų posekių paieškos algoritmas, kuris analizuoja tam tikru būdu sudarytą pradinės duomenų bazės atsitiktinę imtį. Remiantis šia analizedaromos statistinės išvados apie dažnus posekius pradinėje duomenų bazėje. Šis algoritmas nėra tikslus, tačiau veikia daug greičiau negu tikslieji algoritmai ir tinka žvalgomajai statistinei analizei. Tikimybinio algoritmo klaidų tikimybės įvertinamos statistiniais metodais. Tikimybinis algoritmas gali būti derinamas su tiksliaisiais dažnų posekių paieškos algoritmais. Jį galima taikyti ir bendrajam struktūrų paieškos uždaviniui.Probabilistic Algorithm for Mining Frequent SequencesJulija Pragarauskaitė, Gintautas Dzemyda SummaryFrequent sequence mining in large volume databases is important in many areas, e.g., biological, climate, fi nancial databases. Exact frequent sequence mining algorithms usually read the whole database many times, and if the database is large enough, then frequent sequence mining is very long or requires supercomputers. A new probabilistic algorithm for mining frequent sequences is proposed. It analyzes a random sample of the initial database. The algorithm makes decisions about the initial database according to the random sample analysis results and performs much faster than the exact mining algorithms. The probability of errors made by the probabilistic algorithm is estimated using statistical methods. The algorithm can be used together with the exact frequent sequence mining algorithms

    Frequent pattern analysis for decision making in big data

    No full text
    Huge amounts of digital information are stored in the World today and the amount is increasing by quintillion bytes every day. Approximate data mining algorithms are very important to efficiently deal with such amounts of data due to the computation speed required by various real-world applications, whereas exact data mining methods tend to be slow and are best employed where the precise results are of the highest important. This thesis focuses on several data mining tasks related to analysis of big data: frequent pattern mining and visual representation. For mining frequent patterns in big data, three novel approximate methods are proposed and evaluated on real and artificial databases: • Random Sampling Method (RSM) creates a random sample from the original database and makes assumptions on the frequent and rare sequences based on the analysis results of the random sample. A significant benefit is a theoretical estimate of classification errors made by this method using standard statistical methods. • Multiple Re-sampling Method (MRM) is an improved version of RSM method with a re-sampling strategy that decreases the probability to incorrectly classify the sequences as frequent or rare. • Markov Property Based Method (MPBM) relies upon the Markov property. MPBM requires reading the original database several times (the number equals to the order of the Markov process) and then calculates the empirical frequencies using the Markov property. For visual representation of big data, the behaviour of Internet user was analyzed using geometric methods, multidimensional scaling and artificial neural networks. SOM visualization showed the best results; however it requires some training and experience by a user, who is analyzing the data

    Visual decisions in the analysis of customers online shopping behavior

    No full text
    The analysis of the online customer shopping behavior is an important task nowadays, which allows maximizing the efficiency of advertising campaigns and increasing the return of investment for advertisers. The analysis results of online customer shopping behavior are usually reviewed and understood by a non-technical person; therefore the results must be displayed in the easiest possible way. The online shopping data is multidimensional and consists of both numerical and categorical data. In this paper, an approach has been proposed for the visual analysis of the online shopping data and their relevance. It integrates several multidimensional data visualization methods of different nature. The results of the visual analysis of numerical data are combined with the categorical data values. Based on the visualization results, the decisions on the advertising campaign could be taken in order to increase the return of investment and attract more customers to buy in the online e-shop
    corecore